Random teleportation is a necessary evil for ranking and clustering directed networks based on
random walks. Teleportation enables ergodic solutions, but the solutions must necessarily depend on 
the exact implementation and parametrization of the teleportation. For example, in the commonly 
used PageRank algorithm, the teleportation rate must trade off a heavily biased solution with a 
uniform solution. Here we show that teleportation to links rather than nodes enables a much 
smoother trade-off and effectively more robust results. We also show that, by not recording the 
teleportation steps of the random walker, we can further reduce the effect of teleportation with 
dramatic effects on clustering. 



Introduction 

Random walks play a preponderant role in network 
theory [1] and are at the heart of popular metrics mea- 
suring the effect of network topology on patterns of flows 
through the nodes. Defined as the expected density of 
random walkers on a node at stationarity, PageRank pro- 
vides a non-local measure of centrality and is perhaps 
the most important and influential application of ran- 
dom walks [2]. First introduced to rank pages on the 
Web, PageRank [2, 3], or variations of it [4-6], has now 
been adopted to rank the importance of nodes in a broad 
range of systems, e.g., in citation networks [7, 8], food- 
webs [9], and sports [10]. Similarly, in the field of commu- 
nity detection, more and more methods are based on the 
notion that networks often describe systems character- 
ized by flow and the intuitive idea that random walkers 
should be trapped for long times in good communities. 
This idea led to the design of quality functions for net- 
work partitioning such as the so-called map equation [11] 
or stability [12], which naturally take into account the 
constraints imposed by network topology on dynamical 
processes. 

Random walk-based methods are appealing because of 
their nice mathematical properties, their ability to ex- 
plore the system at multiple scales, and their intuitive 
interpretation of how real flows of people, money, infor- 
mation, etc. take place in empirical networks [13, 14]. 
However, most methods suffer from an important draw- 
back: they are defined only at stationarity, a state that is 
either trivial, non-uniquely defined, or never reached in a 
majority of empirical systems. To circumvent this prob- 
lem, mathematical tricks have been proposed to make 
the dynamics ergodic, even when the underlying network 

FIG. 1: Common and smart teleportation in networks.
(a) Recorded node teleportation is the commonly used teleportation scheme. Both steps along
links and teleportation steps contribute to node visit rates for ranking and transition rates
for clustering, and nodes are the targets of teleportation.
(b) In recorded link teleportation, all steps contribute and links are the targets of
teleportation.
(c) In unrecorded node teleportation, only steps along links (solid lines and filled circles)
contribute, and not those due to teleportation (dashed line and open circle).
(d) In unrecorded link teleportation, only steps along links contribute and links are the
targets of teleportation.



is not strongly connected. The most prominent proce- 
dure allows walkers to randomly teleport across the sys- 
tem, and thus to occasionally free themselves from the 
actual topology. Unfortunately, teleportation brings its 
own share of problems. For example, with teleportation, 
the ranking of nodes or their clustering into communities 
depend not only on the topological properties of the sys- 
tem, but also on the exact implementation of the artificial 
teleportation process. 

The goal of this paper is to propose and evaluate dif- 
ferent ways to minimize the effect of teleportation on 
random walk-based metrics and methods. To do so, we 
explore two different but related possibilities for smart 
teleportation. In order to make rankings more robust, 
our first approach modifies the targets of teleportation 
steps, the so-called preference vector. In order to make
clusterings more robust, our second approach modifies 
which steps contribute to transition rates between nodes 
and only counts steps along links and not teleportation 
steps. 

Figure 1 describes the different teleportation schemes. 
In general, the probability of landing on a node after a 
teleportation depends on some of the topological prop- 
erties of the network. Standard teleportation, which we 
call recorded node teleportation, is recovered when the 
preference vector is uniform, i.e., the probability to land 
on each node is the same (see Fig. 1(a)). In this paper, 
for ranking we argue for the use of recorded link telepor- 
tation, where the preference vector is proportional to the 
in-strength of the nodes (see Fig. 1(b)), and is equivalent 
to teleporting to links instead of nodes. For clustering 
we argue for the use of teleportation without recording. 
No teleportation steps are recorded when walkers teleport
uniformly to nodes in unrecorded node teleportation
(see Fig. 1(c)) and to nodes proportionally to their
out-strength in unrecorded link teleportation (see Fig. 1(d)).

The difference between the recorded and unrecorded 
schemes stems from the fact that only steps along links 
contribute to transition rates between nodes, which we 
will show to be crucial for improving community detec- 
tion. Below, we study the mathematical relations and 
differences between the four teleportation schemes illus- 
trated in Fig. 1. We show that the incorporation of ap- 
propriate topological elements into teleportation leads to 
desirable properties in different limit scenarios, and pro- 
vides an interesting connection between local and non- 
local centrality measures. Numerical simulations also 
confirm that the effect of teleportation on ranking and 
clustering can be significantly reduced, with important 
applications for mining the large-scale organization of 
complex networks. 

Mathematics of teleportation 

We focus on weighted and directed networks described by the $N \times N$ adjacency matrix
$W_{ij}$, where $N$ is the number of nodes in the system and $W_{ij}$ is the weight of the
link from $j$ to $i$. The total in- and out-strengths of node $i$ are defined as
$w_i^{\mathrm{in}} = \sum_j W_{ij}$ and $w_i^{\mathrm{out}} = \sum_j W_{ji}$, respectively.
The total weight of all links $W$ is given by
$W = \sum_i w_i^{\mathrm{out}} = \sum_i w_i^{\mathrm{in}}$. In the case of unweighted
networks, $W_{ij}$ is equal to 1 if there is a link going from $j$ to $i$ and 0 otherwise.
Moreover, $w_i^{\mathrm{in}}$ and $w_i^{\mathrm{out}}$ correspond to the in- and out-degrees
of node $i$.

Standard teleportation 

The dynamical properties of an unbiased random walker on a network are determined by the
spectral properties of the transition matrix $T_{ij} = W_{ij}/w_j^{\mathrm{out}}$, which
drives the time-evolution of the expected density $p_i$ of walkers at node $i$

$$p_{i;t+1} = \sum_j T_{ij}\, p_{j;t}. \tag{1}$$

The steady-state density of walkers is given by the dominant eigenvector of $T_{ij}$,
denoted by $\pi_i$, which defines the PageRank of node $i$. Asymptotic convergence towards
this solution and its uniqueness are ensured only if the network is strongly connected and
aperiodic, a situation that rarely occurs in empirical networks. In order to regularize this
situation, several tricks have been proposed in the literature, the most common being to
allow for teleportations through the network. In its simplest instance, walkers either follow
links with probability $\alpha$ or teleport to a random location with probability
$1 - \alpha$.

Random walks with teleportation are driven by the rate equation

$$p_{i;t+1} = \alpha \sum_j T_{ij}\, p_{j;t} + (1 - \alpha)\, v_i, \tag{2}$$

where the preference vector $v_i$, subject to the constraint $\sum_i v_i = 1$, determines the
frequency at which walkers teleport to node $i$. In general, the random process (2) converges
towards a unique steady-state solution for any $\alpha < 1$. Moreover, the stationary solution
of (2) is a function of $v_i$ and of the teleportation probability $1 - \alpha$, formally
given by

$$\pi_{i;\alpha} = (1 - \alpha) \sum_j \left[(I - \alpha T)^{-1}\right]_{ij} v_j, \tag{3}$$

where the dependence on $\alpha$ has been made explicit. This solution can be Taylor expanded
in terms of $\alpha$ to provide the expression [15, 16]:

$$\pi_{i;\alpha} = v_i + \sum_{k=1}^{\infty} \alpha^k \sum_j \left(T^k - T^{k-1}\right)_{ij} v_j, \tag{4}$$

an expression that clearly shows the non-local nature of PageRank, as it is made of terms
associated with paths of any length $k$ in the network.
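
The step from the formal solution (3) to the expansion (4) can be sketched as follows. This is a short worked derivation added for illustration, assuming only that $T$ is column-stochastic and $\alpha < 1$, so that the geometric (Neumann) series of $\alpha T$ converges; vector notation $v$ is used as shorthand for the components $v_j$.

```latex
\begin{align}
(I - \alpha T)^{-1} &= \sum_{k=0}^{\infty} \alpha^k T^k
  \qquad (\alpha < 1), \\
(1 - \alpha) \sum_{k=0}^{\infty} \alpha^k T^k v
  &= \sum_{k=0}^{\infty} \alpha^k T^k v - \sum_{k=0}^{\infty} \alpha^{k+1} T^k v \\
  &= v + \sum_{k=1}^{\infty} \alpha^k \left( T^k - T^{k-1} \right) v .
\end{align}
```

The second sum is re-indexed with $k \to k - 1$, which pairs each power $T^k$ with $T^{k-1}$ at the same order $\alpha^k$ and leaves $v$ as the only zeroth-order term.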

In the above expressions, we have implicitly assumed that each node has at least one outgoing
link, such that $w_i^{\mathrm{out}} > 0$, $\forall i$, and that the transition matrix
preserves probability, i.e., $\sum_i T_{ij} = 1$. In systems where this condition is not
fulfilled, it is usual to impose a teleportation step every time a walker arrives on a
dangling node $j$ without out-links. Mathematically, this corresponds to replacing the $j$th
column of $T_{ij}$, only made of zeros, by the preference vector $v_i$. For the sake of
simplicity, but without loss of generality, in the following mathematical analysis we will
assume that the system does not contain dangling nodes.
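
The rate equation (2), together with the dangling-node rule described above, can be iterated numerically. The following is a minimal sketch, not the authors' implementation: the function name `pagerank`, the tolerance, the iteration cap, and the example matrix are assumptions made for illustration. It follows the convention of the text that `W[i, j]` is the weight of the link from `j` to `i`.

```python
import numpy as np

def pagerank(W, alpha=0.85, v=None, tol=1e-12, max_iter=10_000):
    """Iterate Eq. (2) until the density of walkers stops changing.

    W[i, j] is the weight of the link from j to i (columns are sources).
    v is the preference vector; uniform if omitted (illustrative default).
    """
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    v = np.full(n, 1.0 / n) if v is None else np.asarray(v, dtype=float) / np.sum(v)
    w_out = W.sum(axis=0)                      # out-strength of each source node j
    dangling = w_out == 0
    # T_ij = W_ij / w_j^out; columns of dangling nodes are left at zero for now
    T = np.divide(W, w_out, out=np.zeros_like(W), where=~dangling)
    T[:, dangling] = v[:, None]                # dangling columns replaced by v
    p = v.copy()
    for _ in range(max_iter):
        p_next = alpha * (T @ p) + (1.0 - alpha) * v
        if np.abs(p_next - p).sum() < tol:     # L1 change below tolerance
            return p_next
        p = p_next
    return p

# Hypothetical 3-node directed network: 0 -> 1, 1 -> 0, 1 -> 2, 2 -> 0
W_example = np.array([[0, 1, 1],
                      [1, 0, 0],
                      [0, 1, 0]], dtype=float)
print(pagerank(W_example, alpha=0.85))
```

For small networks the result can be checked against the closed form (3) with a linear solve; power iteration is the option that scales to large sparse networks, at the cost of slower convergence as $\alpha$ approaches 1.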

Limitations of standard teleportation 

Random walks with teleportation have the advantage of making the dynamics ergodic and thus
ensure the existence of a well-defined, asymptotic, steady-state solution. However, due to
its artificial nature and the extra parameter, the teleportation process also raises a series
of fundamental questions [15]. While a random walk is a good proxy for diffusion in a broad
range of networked systems, teleportation can only be viewed as a mathematical trick in the
absence of real-world interpretation. Moreover, even when such an interpretation is plausible,
e.g., for individuals browsing the Web and occasionally jumping to a new page without
following a hyperlink, selecting a proper value of $\alpha$ and an expression for $v_i$ is
problematic.

Most research tends to overlook these issues and use the standard value $\alpha = 0.85$ and
the uniform preference vector $v_i = 1/N$, i.e., a walker randomly teleports to any node,
independently of any intrinsic or topological properties. This choice of preference vector
leads to the recorded node teleportation illustrated in Fig. 1(a). Yet it has been shown that
the stationary solution $\pi_i$ can radically change when $\alpha$ is modified [17-19]. This
dependence is clear when rewriting the formal solution (4) with $v_i = 1/N$

$$\pi_{i;\alpha} = \frac{1}{N} + \frac{1}{N} \sum_{k=1}^{\infty} \alpha^k \sum_j \left(T^{k-1}\right)_{ij} \left( \sum_l \frac{W_{jl}}{w_l^{\mathrm{out}}} - 1 \right). \tag{5}$$

The leading contribution for small $\alpha$ makes PageRank uniform, thereby diluting the
structural differences between nodes, whereas differentiation emerges when $\alpha$ is
increased. Because $\sum_l W_{jl} = w_j^{\mathrm{in}}$, the bracket in (5) equals
$\sum_l W_{jl} (1/w_l^{\mathrm{out}} - 1/w_j^{\mathrm{in}})$ for any node with
$w_j^{\mathrm{in}} > 0$, so the contribution of each path of length $k$ is proportional to the
difference between in- and out-strengths around links, i.e.,
$w_j^{\mathrm{in}} - w_l^{\mathrm{out}}$ for a link from $l$ to $j$. If the network is
strongly connected, it is instructive to note that all of these contributions vanish only when
the network is regular, i.e., $w_i^{\mathrm{in}} = w_i^{\mathrm{out}} = W/N$, $\forall i$.
This version of PageRank is thus expected to depend on $\alpha$ except in this trivial case.
For these reasons, it is important to improve our understanding of the sensitivity due to
teleportation and to identify adequate values of $\alpha$. So far, the rule of thumb has been
to choose values close to 1, in order to minimize the effect of teleportation on the random
walk process, but not too close, because calculations become prohibitively expensive and
unstable in this limit. Similarly, the importance of the preference vector $v_i$ is ignored
in a majority of studies, despite the fact that, in general, no individual choice is better
than any other one, and that different choices seem more realistic in different types of
systems [9, 20]. For instance, in systems where the size of the nodes is heterogeneous, e.g.,
scientific journals publish different numbers of articles, the preference vector can be chosen
proportional to the size of the nodes [7].



Smart Teleportation 

Teleportation can be seen as a mean-field process where walkers jump towards any node $i$
with a probability $v_i$, independently of the underlying network topology. Our aim is to
reduce the noise induced by teleportation in order to produce a more faithful description of
the system, and to minimize its dependence on the value of $\alpha$.



Recorded link teleportation 

Our first approach takes advantage of the ability to choose an appropriate preference vector
to improve the stability of $\pi_{i;\alpha}$. In the PageRank literature, the preference
vector $v_i$ has been introduced as a way to incorporate non-structural properties into the
algorithm and to fine-tune PageRank to the particular taste or interest of a user. Here we
propose instead to select a preference vector based on topological properties of nodes, with
the aim of minimizing the effect of teleportation on dynamics. In the ideal case of a strongly
connected and aperiodic network, an appealing solution is to take $v_i$ proportional to
$\pi_i$, the stationary solution of Eq. (1). In that case, $\pi_i$ is well defined and it is
easy to show that $\pi_{i;\alpha} = \pi_i$ for all values of $\alpha$. The question of picking
a particular value of $\alpha$ thus becomes unimportant.

The previous example is trivial because teleportation is not necessary to make the original
dynamics ergodic. Nonetheless, it provides useful hints on how to address the general problem:
one should aim for a preference vector likely to be close to $\pi_i$. To do so, we propose the
use of

$$v_i = \frac{w_i^{\mathrm{in}}}{W}, \tag{6}$$

inspired by the observations that in-strength is statistically correlated to PageRank in
random networks and that both quantities are equivalent, up to an additive constant, in the
mean-field approximation [21, 22]. This process is equivalent to selecting a link at random
proportionally to its weight during teleportation, hence the name recorded link
teleportation.

Introducing (6) into the formal solution (4) leads to the expression

$$\pi_{i;\alpha} = \frac{w_i^{\mathrm{in}}}{W} + \frac{1}{W} \sum_{k=1}^{\infty} \alpha^k \sum_j \left(T^k\right)_{ij} \left( w_j^{\mathrm{in}} - w_j^{\mathrm{out}} \right), \tag{7}$$

which differs from (5) in several ways. At zeroth order, PageRank for recorded link
teleportation is not uniform anymore, and it is simply given by in-strength, which is itself a
standard and widely-used centrality measure. $k$th-order contributions are made of a weighted
average of contributions at paths of length $k$. The contribution of each node, instead of
each link for (5), is the difference between its in-strength and its out-strength,
$w_j^{\mathrm{in}} - w_j^{\mathrm{out}}$. As expected, nodes concentrating the flow of
probability, $w_j^{\mathrm{in}} > w_j^{\mathrm{out}}$, give a positive contribution, while
nodes diluting this flow, $w_j^{\mathrm{in}} < w_j^{\mathrm{out}}$, give a negative
contribution. Equation (7) thus interpolates between local and non-local centrality measures
when tuning $\alpha$. By construction, (7) also has the interesting property that the PageRank
vanishes for leaves, i.e., nodes whose in-strength is equal to zero, for any $\alpha < 1$, in
agreement with PageRank's original philosophy that votes come from in-neighbours.

Recorded link teleportation offers a range of interesting mathematical properties that make it
an ideal candidate for our purpose. Contrary to recorded node teleportation (5), in which
PageRank depends on $\alpha$ except in trivial situations, (7) has the advantage of being
independent of $\alpha$ when the network is undirected or when the network is Eulerian
($w_i^{\mathrm{in}} = w_i^{\mathrm{out}}$, $\forall i$), as (7) obviously reduces to
$\pi_i = w_i^{\mathrm{in}}/W$ in those cases. Additionally, it is straightforward to show that
PageRank is also given by $\pi_i = w_i^{\mathrm{in}}/W$ for any $\alpha$ in the mean-field
approximation, where the adjacency matrix takes the form
$W_{ij} = w_i^{\mathrm{in}} w_j^{\mathrm{out}}/W$, as can be checked from Eq. (2)

$$\alpha \sum_j \frac{w_i^{\mathrm{in}} w_j^{\mathrm{out}}}{W\, w_j^{\mathrm{out}}} \frac{w_j^{\mathrm{in}}}{W} + (1 - \alpha) \frac{w_i^{\mathrm{in}}}{W} = \frac{w_i^{\mathrm{in}}}{W}. \tag{8}$$

This result is expected to hold in large, well-mixed networks, where mean-field approximations
are known to provide reasonable predictions. Taken together, these results thus suggest that
recorded link teleportation, by blending the directed and the undirected solutions instead of
trading off the directed solution with the uniform solution, provides rankings that are more
robust to the exact value of the teleportation rate.
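
The claimed independence of $\alpha$ on undirected networks can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: the helper names `stationary` and `link_preference` and the small symmetric weight matrix are hypothetical, and the network is assumed to have no dangling nodes. It evaluates the closed-form stationary solution for a preference vector proportional to in-strength and for the uniform preference vector.

```python
import numpy as np

def stationary(W, alpha, v):
    """Closed-form stationary solution: (1 - alpha) (I - alpha T)^{-1} v."""
    W = np.asarray(W, dtype=float)
    T = W / W.sum(axis=0)                  # T_ij = W_ij / w_j^out (no dangling nodes)
    n = W.shape[0]
    return (1.0 - alpha) * np.linalg.solve(np.eye(n) - alpha * T, v)

def link_preference(W):
    """Preference vector proportional to in-strength: v_i = w_i^in / W."""
    W = np.asarray(W, dtype=float)
    return W.sum(axis=1) / W.sum()

# Hypothetical undirected weighted network (symmetric, not regular)
W_undirected = np.array([[0, 2, 1, 0],
                         [2, 0, 0, 3],
                         [1, 0, 0, 1],
                         [0, 3, 1, 0]], dtype=float)
v_link = link_preference(W_undirected)
v_node = np.full(4, 0.25)

for alpha in (0.2, 0.5, 0.85):
    pi_link = stationary(W_undirected, alpha, v_link)   # stays equal to v_link
    pi_node = stationary(W_undirected, alpha, v_node)   # drifts with alpha
    print(alpha, np.round(pi_link, 4), np.round(pi_node, 4))
```

With link teleportation the printed vector should stay at $w_i^{\mathrm{in}}/W$ for every $\alpha$, whereas with uniform node teleportation it moves from nearly uniform towards the strength-proportional solution as $\alpha$ grows.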



Unrecorded teleportation 

Despite its apparent success at minimizing the effects of teleportation on the value of
PageRank, recorded link teleportation suffers from an important limitation: all transitions
are treated equally. This property has unwanted consequences when using random walks to
uncover communities in a network, as teleportation tends to create artificial connections
between nodes in different communities, and thus to water down structures present in the
system. In order to circumvent this limitation, we propose the concept of unrecorded
teleportation, where only steps along links are considered when performing a measurement on
the network [7, 23].

The stationary solution of a random walk with an unrecorded teleportation process can easily
be calculated by applying an extra step without teleportation to the solution for the
corresponding recorded teleportation

$$\pi^{\mathrm{unrec}}_{i;\alpha} = \sum_j T_{ij}\, \pi_{j;\alpha}, \tag{9}$$

which leads to the expression

$$\pi^{\mathrm{unrec}}_{i;\alpha} = (1 - \alpha) \sum_{j,l} T_{ij} \left[(I - \alpha T)^{-1}\right]_{jl} v_l \tag{10}$$

and to the Taylor expansion

$$\pi^{\mathrm{unrec}}_{i;\alpha} = \sum_j T_{ij} v_j + \sum_{k=1}^{\infty} \alpha^k \sum_j \left(T^{k+1} - T^{k}\right)_{ij} v_j, \tag{11}$$

the behavior of which depends on the choice of $v_j$. In the following, we consider two
versions of unrecorded teleportation.

In the first version, shown in Fig. 1(c), called unrecorded node teleportation, the preference
vector is uniform. Unfortunately, this version does not share the aforementioned robustness
properties of recorded link teleportation for PageRank. It is nonetheless interesting to note
that the leading contribution for small values of $\alpha$ is given by

$$\pi^{\mathrm{unrec}}_{i;\alpha} = \frac{1}{N} \sum_j T_{ij} + O(\alpha), \tag{12}$$

which simply counts the weight of incoming links normalized by the out-strength of the
neighbor. This centrality measure finds potential applications in bibliometrics, as it takes
into account the variability in the number of references (out-links) per article, and should
facilitate comparisons of scientific journals and authors across scientific fields [24].

Unrecorded link teleportation, shown in Fig. 1(d), is defined by a preference vector
proportional to out-strength. This choice has the advantage of effectively leading to a
sampling of the links proportionally to their weight in the network. Indeed, selecting a node
with a probability proportional to its out-strength and following one of its links before
recording is equivalent to selecting a link at random in the network. This equivalence ensures
that the stationary solutions of random walks with recorded link teleportation and with
unrecorded link teleportation are identical, and are given by Eq. (7). Unrecorded link
teleportation thus presents the same robustness properties, e.g., independence of PageRank
from $\alpha$ for undirected networks and in the mean-field approximation. As we will see in
simulations in the next section, unrecorded link teleportation has the further advantage of
stabilizing the outcome of community detection algorithms applied to real and artificial
benchmark networks.
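
The equivalence between recorded and unrecorded link teleportation can be verified on a small example. The following sketch is illustrative only: the function names and the 4-node directed weight matrix are assumptions, and the network is chosen without dangling nodes so that the transition matrix is column-stochastic. The unrecorded solution is obtained, as described above, by applying one extra step along links to the recorded solution.

```python
import numpy as np

def transition_matrix(W):
    """T_ij = W_ij / w_j^out, assuming every node has out-links."""
    W = np.asarray(W, dtype=float)
    return W / W.sum(axis=0)

def recorded(W, alpha, v):
    """Stationary solution when teleportation steps are recorded."""
    T = transition_matrix(W)
    n = T.shape[0]
    return (1.0 - alpha) * np.linalg.solve(np.eye(n) - alpha * T, v)

def unrecorded(W, alpha, v):
    """One extra step along links applied to the recorded solution."""
    pi = transition_matrix(W) @ recorded(W, alpha, v)
    return pi / pi.sum()       # a no-op here, kept for networks where T leaks probability

# Hypothetical directed weighted network; W[i, j] is the weight of the link j -> i
W_directed = np.array([[0, 0, 2, 1],
                       [1, 0, 0, 0],
                       [3, 1, 0, 0],
                       [0, 2, 1, 0]], dtype=float)
total = W_directed.sum()
v_in = W_directed.sum(axis=1) / total     # recorded link teleportation target
v_out = W_directed.sum(axis=0) / total    # unrecorded link teleportation target

for alpha in (0.3, 0.85):
    print(alpha,
          np.round(recorded(W_directed, alpha, v_in), 4),
          np.round(unrecorded(W_directed, alpha, v_out), 4))
```

The two printed vectors should coincide for each $\alpha$, because one step along links maps the out-strength distribution onto the in-strength distribution and $T$ commutes with $(I - \alpha T)^{-1}$.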



Smart teleportation and clustering 

To explain why unrecorded teleportation gives much more robust partitions than recorded
teleportation, we have partitioned unweighted directed benchmark networks with known
partitions and tunable module mixing rates $\mu$ [25]. Figure 2 shows that Infomap either

FIG. 2: Robust clustering with unrecorded teleportation in directed benchmark networks.
Panels: (a) Recorded teleportation, (b) Unrecorded teleportation. In the blue region, the
benchmark solution with multiple modules minimizes the map equation for recorded (a) and
unrecorded teleportation (b). In the gray region, the one-module solution is optimal for
recorded (a) and unrecorded teleportation (b). We used Infomap to partition directed LFR
benchmark networks with 1,000 nodes and 7,500 links, with between 20 and 50 nodes in the
communities [25]. Results do not depend on the teleportation target in the benchmark networks
without degree-degree correlations and with uniform out-degree.



finds the benchmark solution or leaves the benchmark unpartitioned in one single module for
all teleportation schemes. Since the random walker movements between modules are only
marginally affected by the target of teleportation in the benchmark networks without
degree-degree correlations and with uniform out-degree, results obtained from link
teleportation and node teleportation are practically the same. But whether teleportation steps
are recorded or not makes all the difference, as shown in Fig. 2. If teleportation steps are
not encoded, the results become independent of the teleportation rate for a given module
mixing, and Infomap recovers the benchmark solution for all module mixings up to $\mu = 0.7$.
At the same mixing rate, with 70 percent of each node's links connecting to nodes outside its
own cluster, recorded teleportation hits the limit for which no low teleportation rate can
generate the benchmark solution. Above module mixing rate $\mu = 0.7$, the community structure
of the benchmark network is smeared out by any teleportation rate. For module mixing rates
approaching zero, Infomap can recover the benchmark solutions at increasing teleportation
rates.

Figure 3 explains what partition Infomap will find at different module mixing and
teleportation rates. As the figure shows, Infomap finds the benchmark partition as long as the
benchmark partition provides a shorter description length than the unpartitioned network. The
sharp transition from the benchmark solution with multiple modules to the unpartitioned
one-module solution happens without any intermediate solutions at teleportation rate
$1 - \alpha = 0.6$ for module mixing rate $\mu = 0.2$, as shown in Figs. 3(a) and (c), and at
teleportation rate $1 - \alpha = 0.2$ for module mixing rate $\mu = 0.6$, as shown in

FIG. 3: Unrecorded teleportation strongly reduces the influence of the teleportation rate.
The top panels show the normalized mutual information between the benchmark partition at low
module mixing rate (a) and high module mixing rate (b), and partitions generated by Infomap
for four different teleportation schemes: without encoding of teleportation steps to links
(Unrec link) and nodes (Unrec node) and with encoding of teleportation steps to links (Rec
link) and nodes (Rec node). The bottom panels show the codelength of the map equation for
different partitions of benchmark networks at low module mixing rate (c) and high module
mixing rate (d): the codelengths associated with unpartitioned networks (with and without
encoding of teleportation steps) and the codelengths associated with the benchmark partition
and the partition generated by Infomap (with and without encoding of teleportation steps).
Results are based on the same data as in Fig. 2.



Figs. 3(b) and (d). Consequently, for recorded teleportation, the total module mixing from
links and teleportation determines which solution provides the shortest description of the
random walker on the network. For unrecorded teleportation, however, the codelength becomes
almost independent of the teleportation rate and the module mixing from links determines the
clustering result.

In the next section, we will demonstrate the advantages 
of using smart teleportation over standard teleportation 
for ranking and clustering real-world networks. Contrary 
to the benchmark networks analyzed above, real-world 
networks have modules with varying degree of mixing. 
Therefore, we will not see the sharp transition from a 
single multi-module solution to the unpartitioned one- 
module solution. Instead, we will see a gradually decreas- 
ing normalized mutual information as the increased tele- 
portation rate smears out the boundaries of weak clus- 
ters. 
Smart teleportation in real-world networks



Ranking of scientific journals 

In this section we explore the effect of teleportation on ranking and clustering in real-world
networks. We begin with an illustrative example of ranking. We ranked 7,940 journals connected
by 9.2 million citations aggregated in 1.2 million weighted links [26] with the four different
teleportation schemes for different teleportation rates $1 - \alpha$ and reported the node
visit rates for five top journals: Nature, Science, Proceedings of the National Academy of
Sciences (PNAS), The Journal of Biological Chemistry (JBC), and Physical Review Letters (PRL)
(see Figure 4). The choice of teleportation scheme affects not only the absolute node visit
rates, but also the relative node visit rates between the journals, and therefore also the
rank order. Teleportation to links, whether recorded or unrecorded, dramatically reduces the
undesirable damping effect of teleportation. Ranking with link teleportation is less sensitive
to the choice of teleportation rate, but the rank order nevertheless depends on whether only
the local neighborhood of a node (high teleportation rates) or the entire network (low
teleportation rates) is considered. For example, Figs. 4(b) and 4(d) show that Nature ranks
higher than The Journal of Biological Chemistry only when the entire network structure is
taken into account.



FIG. 4: Link teleportation reduces the influence of teleportation rate on top-ranked
scientific journals. Reported in percent, the journal visit rates obtained with recorded node
teleportation in (a), recorded link teleportation in (b), unrecorded node teleportation in
(c), and unrecorded link teleportation in (d).



Similarly, we find that the choice of teleportation 
scheme dramatically affects the clustering results. For 
example, when we clustered the scientific journals with 
the Infomap method [11], we obtained non-trivial solu- 
tions for teleportation rates below 50 percent for recorded 
node teleportation and below 75 percent for recorded link 
teleportation. For unrecorded teleportation, however, we 
obtained non-trivial solutions for all teleportation rates. 



Data description 

To quantitatively compare the ranking and clustering results between standard and smart
teleportation in a more systematic way, we analyzed eight real-world networks. We selected
networks of widely different sizes, topologies, and origins:

Coauthorship is a weighted undirected network included
for reference that describes more than 2,500 coauthor- 
ships between about 500 network scientists [23]. 

US airports is a weighted directed network that de- 
scribes about 18,000 connections weighted by passen- 
ger flow between close to 500 airports in the US in 2007 
[23]. 

US political blogs is an unweighted directed network 
that describes about 19,000 hyperlinks between almost 
1,500 blogs on US politics collected in 2005 [27].

Swe. political blogs is a weighted directed network that
describes about 13,000 connections between more than 
1,000 political blogs in Sweden in 2010 [23]. 

Journal citations is a weighted directed network that 
describes more than a million connections formed by 
around 10 million citations between close to 8,000 sci- 
entific journals in 2007 [28]. 

Call graph is a weighted directed network that de- 
scribes more than 7,000 calls between about 2,500 func- 
tions in the cross-platform library GLib [23]. 

Stanford web is a directed network that describes 2.3 
million hyperlinks between almost 300,000 web pages 
in the domain stanford.edu [29]. 

Google web is a directed network that describes 5.1 mil- 
lion hyperlinks between more than 700,000 web pages 
from the Google Programming Contest in 2002 [29]. 

For each network, we analyzed ranking and clustering 
robustness of the four different teleportation schemes de- 
picted in Fig. 1: recorded teleportation to nodes and to 
links and unrecorded teleportation to nodes and to links. 
We quantified the robustness of the results to variations in the teleportation rate by
measuring the similarity between results obtained at the commonly used teleportation rate
$1 - \alpha = 0.15$ and results obtained at lower and higher teleportation rates.

FIG. 5: Robust rank size with smart teleportation in real-world networks. We measured the
cosine similarity between the node rank sizes obtained at teleportation rate
$1 - \alpha = 0.15$ and the node rank sizes obtained at lower and higher teleportation rates.
When teleportation steps are included in the node visit rates of the random walker,
teleportation to links (Rec link) is more robust than uniform teleportation to nodes (Rec
node). When teleportation steps are not included in the node visit rates, the rank size is
overall more robust and link teleportation (Unrec link) is again more robust than node
teleportation (Unrec node). Note that recorded and unrecorded link teleportation by definition
give the same rank size. See main text for details about the networks.



Robust ranking 

We used the power iteration method to derive the node visit frequencies for the four different
teleportation schemes. When located on a node with $k^{\mathrm{out}} = 0$, a random walker
automatically performs a teleportation, as in the original formulation of PageRank. To obtain
the node visit frequencies for the unrecorded teleportation scheme, we first calculated the
node visit frequencies with the recorded teleportation scheme and then performed an extra step
without teleportation followed by normalization. We are interested in the robustness of both
the node rank sizes and the node rank order. There are several ways to measure the similarity
between the sizes and orderings of two node rankings, but we opted for two simple metrics.

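For rank size comparisons between different node visit rates $\pi_{i;x}$ and $\pi_{i;y}$
obtained by different teleportation rates $1 - \alpha_x$ and $1 - \alpha_y$, we used the
commonly used cosine similarity

$$s = \frac{\sum_i \pi_{i;x}\, \pi_{i;y}}{\sqrt{\sum_i \pi_{i;x}^2}\, \sqrt{\sum_i \pi_{i;y}^2}}. \tag{13}$$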

Figure 5 shows the cosine similarity measured between the node rank sizes obtained at
teleportation rate $1 - \alpha = 0.15$ and the node rank sizes obtained at lower and higher
teleportation rates. As for the top-ranked journals in Fig. 4, for all teleportation schemes
the results depend on the teleportation rate and the similarity is only perfect at the
reference teleportation rate $1 - \alpha = 0.15$. But for all networks, link teleportation is
equally or more robust than node teleportation, and unrecorded teleportation is equally or
more robust than recorded teleportation. The commonly used recorded teleportation to nodes is
by far the least robust teleportation scheme.
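
The rank size comparison can be written in a few lines. This is a minimal sketch of the standard cosine similarity between two vectors of node visit rates; the function name `cosine_similarity` and the example vectors are illustrative assumptions, not part of the original analysis.

```python
import numpy as np

def cosine_similarity(pi_x, pi_y):
    """Cosine of the angle between two vectors of node visit rates."""
    pi_x = np.asarray(pi_x, dtype=float)
    pi_y = np.asarray(pi_y, dtype=float)
    return float(pi_x @ pi_y / (np.linalg.norm(pi_x) * np.linalg.norm(pi_y)))

# Hypothetical visit rates of the same four nodes at two teleportation rates
pi_reference = np.array([0.40, 0.30, 0.20, 0.10])
pi_other = np.array([0.35, 0.30, 0.22, 0.13])
print(cosine_similarity(pi_reference, pi_other))
```

Because visit rates are non-negative, the similarity lies between 0 and 1, and it equals 1 only when the two vectors are proportional to each other.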

For comparing different node rank orders, we measured 
the mutual information between node-pair comparisons 
sampled from the rankings. That is, we sampled pairs 
$i, j$ of nodes proportional to the node rank sizes $\pi_{i;x}$, $\pi_{j;x}$ 
from the ranking obtained at teleportation rate $1 - \alpha_x$ 
and measured the reduction of uncertainty about which 
of the two nodes $X = \{i, j\}$ has the higher rank after 
observing the order $Y$ of the other ranking obtained at 
teleportation rate $1 - \alpha_y$. In general, with joint probability 
distribution $p(x, y)$ and marginal probability distributions 
$p(x) = \sum_y p(x, y)$ and $p(y) = \sum_x p(x, y)$, the 
mutual information is given by 

$$I(X;Y) = \sum_{x,y} p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)}. \qquad (14)$$



[Figure 6. Panels: (a) Coauthorship, US airports, (c) US political blogs, (d) Swe. political blogs, (e) Journal citations, Google web; horizontal axis: teleportation rate.]

FIG. 6: Robust rank order with smart teleportation in real-world networks. We measured the mutual information between 
the node rank order obtained at teleportation rate $1 - \alpha = 0.15$ and the node rank order obtained at lower and higher 
teleportation rates. In general, there is no advantage in not counting teleportation steps (Unrec) over counting teleportation steps 
(Rec), but link teleportation (link) is again more robust than node teleportation (node). Note that recorded and unrecorded 
link teleportation by definition give the same rank order. See main text for details about the networks. 



With the unit step function 

$$\theta(z) = \begin{cases} 1 & \text{if } z > 0 \\ 0 & \text{if } z < 0 \end{cases} \qquad (15)$$

the joint probability of, for example, $X = i$ and $Y = j$, 
is 



The factor $\pi_{i;x}\pi_{j;x}$, obtained by picking nodes proportional 
to their visit frequencies, guarantees that the order 
between highly ranked nodes carries more weight in the 
comparison. If one ranking provides no information about 
the other ranking, one bit of information would be necessary 
to determine which of two nodes is the one with 
the higher rank. Therefore, the mutual information cannot 
be larger than one bit. But because some pairs of 
nodes can have the same rank, we normalize 
the mutual information by dividing by the maximum entropy 
of $X$ and $Y$. With the entropy given by 

$$H(X) = -\sum_x p(x) \log p(x), \qquad (17)$$

the normalized mutual information takes the form 

$$R = \frac{I(X;Y)}{\max\left(H(X), H(Y)\right)}. \qquad (18)$$



We normalize by dividing by the maximum entropy of X 
and Y rather than the commonly used average to avoid 
rewarding simplistic solutions with many or all nodes of 
equal rank. Figure 6 shows the normalized mutual information 
between the node rank order obtained at teleportation 
rate $1 - \alpha = 0.15$ and the node rank order obtained at 
lower and higher teleportation rates. For all networks, 
rank order is the same for unrecorded and recorded tele- 
portation when teleporting to links as shown in Fig. 1. 
The rank order generated by node teleportation is more 
robust in the strongly directed call graph, but more often 
the rank order generated by link teleportation is more ro- 
bust. For example, all link teleportation rates generate 
the same rank order in the undirected coauthorship net- 
work, whereas the rank order is influenced by node tele- 
portation rates. Teleportation to links can take advan- 
tage of possible bidirectional connections between nodes. 
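
The rank-order comparison can be sketched in code as follows. Because the exact joint probability used by the authors is not reproduced here, this is only one plausible reading: ordered node pairs are weighted by the product of their visit rates in the first ranking, each pair is labeled by which node ranks higher in each ranking, and ties are treated as a third outcome. The function names, the tie handling, and the value returned when both entropies vanish are assumptions of this sketch.

```python
import math


def _order(a, b):
    """Label which of two rank sizes is larger (ties are a separate outcome)."""
    if a > b:
        return "first"
    if a < b:
        return "second"
    return "tie"


def rank_order_nmi(pi_x, pi_y):
    """Normalized mutual information between pairwise orders of two rankings.

    pi_x, pi_y: equally long lists of node visit rates for the same nodes.
    Pairs (i, j) are weighted by pi_x[i] * pi_x[j].
    """
    joint = {}
    total = 0.0
    n = len(pi_x)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            weight = pi_x[i] * pi_x[j]
            key = (_order(pi_x[i], pi_x[j]), _order(pi_y[i], pi_y[j]))
            joint[key] = joint.get(key, 0.0) + weight
            total += weight
    joint = {k: w / total for k, w in joint.items()}

    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p

    mutual = sum(p * math.log2(p / (p_x[x] * p_y[y]))
                 for (x, y), p in joint.items() if p > 0)
    h_x = -sum(p * math.log2(p) for p in p_x.values() if p > 0)
    h_y = -sum(p * math.log2(p) for p in p_y.values() if p > 0)
    denom = max(h_x, h_y)
    # Degenerate case (all pairs tied in both rankings): convention of this sketch.
    return mutual / denom if denom > 0 else 0.0


r_same = rank_order_nmi([0.5, 0.3, 0.2], [0.6, 0.3, 0.1])
```

The double loop scales quadratically with the number of nodes, so for large networks the pairs would have to be sampled rather than enumerated, as the text describes.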



Robust clustering 

When clustering nodes in networks based on random 
walks, not only the node visit rates, as for ranking, but 
also the transition rates between nodes affect the re- 
sult. Therefore, and as the example with scientific jour- 
nals above demonstrates, the teleportation scheme and 
teleportation rate have dramatic effects on clustering. 
To quantitatively compare the teleportation schemes, 
we clustered the eight real-world networks with the 



[Figure 7. Panels: (a) Coauthorship, (b) US airports, (c) Political blogs, (d) Swe. political blogs, (e) Journal citations, (f) Call graph, (g) Stanford web, (h) Google web; horizontal axis: teleportation rate.]

FIG. 7: Robust clustering without encoding of teleportation steps in real-world networks. We measured the mutual 
information between the obtained partitions at teleportation rate $1 - \alpha = 0.15$ and the obtained partitions at lower and higher 
teleportation rates. Not encoding teleportation steps (Unrec) is always better than encoding teleportation steps (Rec). The 
robustness of teleportation to nodes (node) or links (link) depends on the network. Each data point corresponds to the average 
over 100 pairwise comparisons between partitions generated with the Infomap method [11]. See main text for details about the 
networks. 



information-theoretic clustering method Infomap [11]. 
Infomap searches for the network partition that mini- 
mizes the description length of a random walker guided 
by the links of the network. By altering the dynamics of 
the random walker, for example, by altering the telepor- 
tation rate, Infomap may consequently identify different 
partitions. To test how much the partitions change when 
the teleportation rate is altered, we used the normalized 
mutual information applied to cluster comparisons [30]. 
In this way, we can compare the robustness associated 
with the different teleportation schemes. 
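
For intuition about the description length that Infomap minimizes, the sketch below evaluates a two-level description length for a given partition. It assumes the standard expanded form of the map equation from the literature around Ref. [11], which is not spelled out in this text, and it counts module exits from link flow only, in the spirit of not encoding teleportation steps. The function names and input format are hypothetical, and the search over partitions that the Infomap software performs is not shown.

```python
import math


def plogp(p):
    """p * log2(p), with the usual convention 0 * log 0 = 0."""
    return p * math.log2(p) if p > 0 else 0.0


def description_length(node_flow, link_flow, module_of):
    """Two-level description length in bits per step (assumed standard form).

    node_flow: dict node -> visit rate (sums to one).
    link_flow: dict (source, target) -> flow along that link per step.
    module_of: dict node -> module label.
    """
    module_flow = {}
    exit_flow = {}
    for u, p in node_flow.items():
        m = module_of[u]
        module_flow[m] = module_flow.get(m, 0.0) + p
        exit_flow.setdefault(m, 0.0)
    for (u, v), flow in link_flow.items():
        if module_of[u] != module_of[v]:
            exit_flow[module_of[u]] += flow
    total_exit = sum(exit_flow.values())
    return (plogp(total_exit)
            - 2.0 * sum(plogp(q) for q in exit_flow.values())
            - sum(plogp(p) for p in node_flow.values())
            + sum(plogp(exit_flow[m] + module_flow[m]) for m in module_flow))


# Two disconnected node pairs with uniform visit rates.
flow = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}
links = {("a", "b"): 0.25, ("b", "a"): 0.25, ("c", "d"): 0.25, ("d", "c"): 0.25}
one_module = description_length(flow, links, {u: 0 for u in flow})
two_modules = description_length(flow, links, {"a": 0, "b": 0, "c": 1, "d": 1})
```

Because both the node visit rates and the link flows enter the expression, a change in the teleportation scheme or rate changes the description length of every candidate partition, which is why the optimal partition can shift.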

The mutual information between two network parti- 
tions measures how much we learn about one network 
partition by studying the other one. We always used 
the network partition obtained at the commonly used 
teleportation rate $1 - \alpha = 0.15$ as reference. To avoid 
undesirable effects that singletons can cause, we sampled 
the nodes proportionally to their visit frequencies rather 
than uniformly when calculating mutual information. In 
this way, we also put emphasis on correct assignments 
of important nodes. For fair comparisons between par- 
titions obtained at different teleportation rates, we used 
the node visit frequencies of the reference partition. By 
normalizing by the maximum entropy of the two partitions 
rather than the average of the two, we naturally 
penalize overfitting and avoid rewarding underfitting. 

With $s_x$ for the total visit rate of all nodes in module $x$ 
and $s_{xy}$ for the total visit rate of all nodes that are jointly 
partitioned in module $x$ and module $y$, the entropy takes 
the form 

$$H(X) = -\sum_x s_x \log s_x, \qquad (19)$$

the mutual information 

$$I(X;Y) = \sum_{x,y} s_{xy} \log \frac{s_{xy}}{s_x s_y}, \qquad (20)$$

and the normalized mutual information as in Eq. (18). 
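
A minimal sketch of this flow-weighted partition comparison is given below. It follows Eqs. (19) and (20) with normalization by the maximum entropy as in Eq. (18). The function name, the dictionary-based inputs, the use of base-2 logarithms, and the value returned when both partitions consist of a single module are assumptions of this sketch.

```python
import math


def partition_nmi(part_x, part_y, visit):
    """Flow-weighted normalized mutual information between two partitions.

    part_x, part_y: dict node -> module label for the same set of nodes.
    visit: dict node -> visit frequency (for example, from the reference run).
    """
    total = sum(visit[u] for u in part_x)
    s_x, s_y, s_xy = {}, {}, {}
    for u in part_x:
        w = visit[u] / total
        x, y = part_x[u], part_y[u]
        s_x[x] = s_x.get(x, 0.0) + w
        s_y[y] = s_y.get(y, 0.0) + w
        s_xy[(x, y)] = s_xy.get((x, y), 0.0) + w

    def entropy(s):
        return -sum(v * math.log2(v) for v in s.values() if v > 0)

    mutual = sum(v * math.log2(v / (s_x[x] * s_y[y]))
                 for (x, y), v in s_xy.items() if v > 0)
    denom = max(entropy(s_x), entropy(s_y))
    # Degenerate case (both partitions are one module): convention of this sketch.
    return mutual / denom if denom > 0 else 0.0


visit = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}
reference = {"a": 0, "b": 0, "c": 1, "d": 1}
singletons = {"a": 0, "b": 1, "c": 2, "d": 3}
score = partition_nmi(reference, singletons, visit)
```

In this toy case the all-singleton partition determines the reference partition completely, yet the score stays below one because the larger of the two entropies appears in the denominator. This is the penalty for overly fine partitions that the text motivates.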

Figure 7 shows that unrecorded teleportation gives 
more robust clustering for all tested networks and, in 
general, link teleportation gives more robust results than 
node teleportation. Recorded teleportation gives robust 
results in a window around teleportation rate $1 - \alpha = 0.15$, 
but the normalized mutual information quickly 
drops to zero outside this window. In contrast, for 
unrecorded teleportation, the normalized mutual information 
stays relatively high for all values of the teleportation 
rate. 

Conclusions 

When ranking and clustering nodes in networks, we 
have demonstrated analytically and numerically that 
we can drastically reduce undesirable and parameter-dependent 
effects of standard teleportation with unrecorded 
teleportation to links. Because this smart teleportation 
scheme takes advantage of the topology of the 
network, blending the directed and the undirected solutions 
instead of trading off the directed solution with 
the uniform solution, results are more robust to the 
exact value of the teleportation rate. In particular, we 
have shown analytically that ranking results are exact 
and independent of teleportation rates for undirected and 
well-mixed networks, and for all the real-world networks 
we have analyzed, smart teleportation is as good as or 
better than standard teleportation. 

When clustering networks with Infomap based on the 
movements of a random walker on the network, not 
recording the teleportation steps makes the results for 
real-world networks dramatically more robust. Because 
smart teleportation eliminates mixing between network 
communities, results for benchmark networks are practically 
independent of the teleportation rate. The advantages 
of smart teleportation over standard teleportation 
make it interesting to explore the benefits in other 
flow-based clustering algorithms and variations of PageRank. 